The Predictive Value of Time-Varying Noninvasive Scores on Long-Term Prognosis of NAFLD in South Korea

Background. This study aimed to examine whether repeated measurements of noninvasive fibrosis scores during follow-up improve long-term nonalcoholic fatty liver disease (NAFLD) outcome prediction. Methods. A cohort study of 2,280 NAFLD patients diagnosed at the Seoul National University Hospital from 2001 to 2015 was conducted. Multivariable Cox regression models with baseline and designated time-point measurements of the fibrosis-4 index (FIB-4) and NAFLD fibrosis score (NFS) were used to assess the association between these scores and overall mortality, liver-related outcomes, and cardiovascular events. Results. Higher baseline NFS (high versus low probability of advanced fibrosis groups) was associated with a higher risk of mortality (adjusted hazard ratio (aHR), (95% confidence interval (CI)), 2.80, [1.39–5.63]) and liver-related outcomes (3.70, [1.27–10.78]). Similar findings were observed for the association of baseline FIB-4 with mortality (2.49, [1.46–4.24]) and liver-related outcomes (11.50, [6.17–21.44]). In models considering designated time-point measurements of the scores, stronger associations were noted. For NFS, a higher time-point measurement was associated with a significantly higher risk of mortality (3.01, [1.65–5.49]) and liver-related outcomes (6.69, [2.62–17.06]). For FIB-4, higher time-point measurements were associated with significantly higher mortality (3.01, [1.88–4.82]) and liver-related outcomes (13.26, [6.89–25.53]). An annual increase in FIB-4 (2.70, [1.79–4.05]) or NFS (4.68, [1.52–14.44]) was associated with an increased risk of liver-related outcomes. No association between NFS/FIB-4 and risk of cardiovascular events was observed in either model. Conclusions. Higher aHRs describing the associations of FIB-4/NFS with overall mortality and liver-related outcomes were observed in the models that included designated time-point measurements of the scores. In addition to the baseline measurement, routine monitoring of these scores may be important in predicting the prognosis of NAFLD patients.

Introduction
Nonalcoholic fatty liver disease (NAFLD) is the most common liver disease worldwide [1]. The overall global prevalence of NAFLD is 30.1% based on a meta-analysis of 92 population studies from 1990 to 2019 [2,3].
Although NAFLD is highly prevalent, most patients do not develop clinically relevant liver disease [4,5]. A subset of patients with NAFLD develops advanced fibrosis, with a risk of progression to cirrhosis and hepatocellular carcinoma (HCC), which can lead to liver-related morbidity and mortality [6-8]. The most important prognostic factor for liver-related diseases and mortality in NAFLD is the fibrosis stage [9]. Previous longitudinal studies have shown that advanced fibrosis (i.e., stage 3-4 by liver biopsy) was the major predictor of clinically significant liver-related outcomes [10]. Individuals without advanced fibrosis had a lower risk of progression to cirrhosis within a 10-15-year time frame [11], whereas those with advanced fibrosis more frequently experienced severe liver-related endpoints and had higher overall mortality [12]. However, because liver biopsy is invasive, it is not routinely performed, and serial follow-up biopsies are even less common. Consequently, various scoring systems have been developed to identify patients with a high probability of advanced fibrosis based on data routinely collected in clinical settings; the fibrosis-4 index (FIB-4), aspartate aminotransferase to platelet ratio index (APRI), and NAFLD fibrosis score (NFS) are among the most widely used [13].
Population-based observational studies have found that higher NFS, FIB-4, and APRI scores were associated with an increased risk of liver disease and overall mortality among patients with NAFLD [14-17]. Furthermore, a retrospective cohort study found that an increase in FIB-4 over time was associated with a higher risk of severe liver disease, whereas a decrease in FIB-4 was associated with a reduced risk [18]. However, it remains unclear whether repeated measurements of these scores can improve the prediction of long-term clinical outcomes.
In this study, we evaluated prognosis in 2,280 NAFLD patients with a median follow-up of over 10 years and assessed whether repeated measurements of NFS, FIB-4, and APRI could enhance the prediction of overall mortality, liver-related outcomes, and cardiovascular events.

Patients.
A retrospective cohort study of adult NAFLD patients diagnosed from 2001 to 2015 at the Seoul National University Hospital (SNUH) was conducted. NAFLD was initially defined as fatty liver without significant alcohol intake (≥210 g per week for men and ≥140 g per week for women). Fatty liver was identified if liver echogenicity surpassed that of the renal cortex and spleen, accompanied by ultrasound wave attenuation, loss of diaphragm definition, and poor delineation of intrahepatic architecture [19]. In the absence of ultrasound data, precontrast computed tomography images were used. Fatty liver was diagnosed if the liver's attenuation was at least 10 Hounsfield units lower than that of the spleen or was below 40 Hounsfield units [20]. Patients with chronic hepatitis B or chronic hepatitis C were not included. NAFLD patients who met any of the following criteria were also excluded: prior history of liver cirrhosis, prior history of medication known to cause fatty liver, prior history of liver transplantation, prior history of cancer, prior history of cardiocerebrovascular disease, prior history of human immunodeficiency virus infection, prior history of severe thrombocytopenia, early malignancy event or cirrhosis development, or incomplete laboratory data (Figure 1). Baseline characteristics of included and excluded NAFLD patients are summarized in Supplementary Table 1.
Recently, the focus has shifted from defining NAFLD to emphasizing metabolic-associated aspects, leading to the frequent use of the term metabolic dysfunction-associated steatotic liver disease (MASLD). This term avoids stigmatizing language such as "nonalcoholic" and "fatty liver" [21]. Consequently, in this study, we also identified a subpopulation of NAFLD patients who met the criteria for MASLD.
The Institutional Review Board of Seoul National University Hospital approved the study protocol, and the study was conducted in accordance with the Declaration of Helsinki (approval no.: H-2104-044-1210). Because of the retrospective nature of this study, the requirement for informed consent was waived.

Outcomes and Variables.
The primary outcomes were overall mortality, liver-related outcomes, and cardiovascular events. Overall mortality was determined from the database of the Ministry of the Interior and Safety of Korea. Liver-related outcomes included cirrhosis, decompensated cirrhosis, liver transplantation, and HCC. Cirrhosis was defined as any history of admission or of more than one outpatient visit with a diagnosis of cirrhosis, together with at least one of the following: (1) histological findings, (2) ultrasonographic findings (nodules in hepatic parenchyma, splenomegaly (>12 cm), or enlarged portal vein (>16 mm)), or (3) endoscopic findings compatible with cirrhosis. Decompensated cirrhosis was defined as new onset of clinically obvious ascites, overt encephalopathy, or variceal hemorrhage [22]. HCC was defined according to the American Association for the Study of Liver Diseases [23] and European Association for the Study of the Liver HCC guidelines [24]. New diagnoses of acute coronary syndrome (ACS), heart failure (HF), stroke, or peripheral arterial disease (PAD) were considered cardiovascular events. All these cardiovascular events were diagnosed according to their clinical criteria; they were initially screened through diagnosis codes (International Classification of Diseases (ICD)-10 codes I21, I50, I63, and I73.9) and confirmed by physicians' review of electronic medical records.
The index date was defined as the time of the first diagnosis of NAFLD during the study period. Follow-up ended at the time of an event, emigration, death, or the end of follow-up (December 31, 2021), whichever came first. Patients were also censored when significant alcohol intake was documented during the follow-up period. Patients' demographic information and comorbidities were collected at baseline. Repeated measurements of FIB-4, APRI, and NFS were collected at the follow-up visits in year 2 and year 4. Calculations for FIB-4, APRI, and NFS and the categorization into low, intermediate, and high probability of advanced fibrosis have been described elsewhere [25-27]. Specifically, for FIB-4, two cutoff points were selected to categorize subjects into three groups: low (age under 60, FIB-4 <1.30; age over 60, FIB-4 <2.00), intermediate (age under 60, FIB-4: 1.30 to <2.67; age over 60, FIB-4: 2.00 to <2.67), and high (FIB-4 ≥2.67) probability of advanced fibrosis [27]. For APRI, two cutoff points were selected to categorize subjects into three groups: low (APRI <0.5), intermediate (APRI: 0.5 to <1.5), and high (APRI ≥1.5) probability of advanced fibrosis [28]. For NFS, two cutoff points were selected to categorize subjects into three groups: low (NFS < −1.455), intermediate (NFS: −1.455 to <0.676), and high (NFS ≥0.676) probability of advanced fibrosis [26]. Annual changes in FIB-4, NFS, and APRI scores were analyzed to assess their association with the main clinical outcomes.
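The scoring and categorization described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code (the study's analyses were run in R). The formulas are the standard published definitions of FIB-4, APRI, and NFS and are not restated in this paper; the function names are hypothetical; and placing patients aged exactly 60 in the older FIB-4 group is an assumption, since the text only distinguishes "under 60" and "over 60".

```python
import math


def fib4(age, ast, alt, platelets):
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)).

    age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    return (age * ast) / (platelets * math.sqrt(alt))


def apri(ast, ast_uln, platelets):
    """APRI = (AST / upper limit of normal AST) x 100 / platelets (10^9/L)."""
    return (ast / ast_uln) * 100.0 / platelets


def nfs(age, bmi, diabetes_or_ifg, ast, alt, platelets, albumin):
    """NAFLD fibrosis score; BMI in kg/m^2, albumin in g/dL, platelets in 10^9/L."""
    return (-1.675 + 0.037 * age + 0.094 * bmi
            + 1.13 * (1 if diabetes_or_ifg else 0)
            + 0.99 * (ast / alt) - 0.013 * platelets - 0.66 * albumin)


def _three_groups(score, low_cut, high_cut):
    """Shared rule: below low_cut is low, at or above high_cut is high."""
    if score >= high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "intermediate"


def fib4_category(score, age):
    # Assumption: age exactly 60 uses the older-age lower cutoff (2.00).
    low_cut = 1.30 if age < 60 else 2.00
    return _three_groups(score, low_cut, 2.67)


def apri_category(score):
    return _three_groups(score, 0.5, 1.5)


def nfs_category(score):
    return _three_groups(score, -1.455, 0.676)
```

With these helpers, a hypothetical 50-year-old with AST 40 U/L, ALT 25 U/L, and platelets of 200 x 10^9/L has a FIB-4 of 2.0, which falls in the intermediate group under the age-specific cutoffs quoted above.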

Statistical Analysis.
Patients' characteristics were described using means with standard deviations (SD), medians with interquartile ranges (IQR), or total numbers with percentages, where applicable. The cumulative incidence of outcomes was calculated as the number of events divided by the number of person-years during the study period. For patients with missing laboratory data, we performed multiple imputation using the multiple imputation by chained equations (MICE) method, under the assumption that data were missing at random, and analyzed baseline and designated time-point measurements of NFS, FIB-4, and APRI as part of the sensitivity analysis [29].
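As a small worked illustration of the incidence calculation described above, the following Python sketch divides the number of events by the accumulated person-years and scales the result to 100 person-years. The function name and the follow-up times are hypothetical and are not study data.

```python
def incidence_per_100_person_years(events, follow_up_years):
    """Events divided by total person-years, expressed per 100 person-years.

    follow_up_years: one follow-up duration (in years) per patient.
    """
    person_years = sum(follow_up_years)
    if person_years <= 0:
        raise ValueError("total person-years must be positive")
    return 100.0 * events / person_years


# Hypothetical example: 4 patients followed for 250 person-years in total,
# among whom 2 events occurred.
example_rate = incidence_per_100_person_years(2, [100.0, 50.0, 60.0, 40.0])
```

In this made-up example the rate is 0.8 events per 100 person-years, which is the same unit used for the incidences reported in Table 2.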
Multivariable Cox regression models incorporating baseline and designated time-point measurements of NFS and FIB-4 were fit separately to assess the association between these scores at baseline or at designated time points and the primary outcomes. In addition to FIB-4, NFS, and APRI, the variables included in the multivariable Cox regression models were age, sex, baseline body mass index (BMI) [30], baseline type II diabetes [31], baseline hyperlipidemia [32], and baseline hypertension [33], which are commonly recognized prognostic risk factors in NAFLD. The variables were predetermined based on the existing literature. Analyses were performed using R 4.2.0 (R Foundation for Statistical Computing, Vienna, Austria). All statistical tests were two-sided, and Bonferroni correction was applied to account for multiple comparisons (n = 54). Statistical significance was determined at P < 0.001 after correction.
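Two pieces of this analysis plan lend themselves to a short sketch. First, dividing a nominal alpha by the 54 comparisons gives the corrected threshold; the nominal level of 0.05 is an assumption, as the text reports only the corrected cutoff of P < 0.001. Second, one common way to feed repeated measurements into a Cox model is the counting-process layout, in which each patient contributes one row per interval between measurements. Whether the authors structured their data this way is not stated, so the layout, the function names, and the example values below are all hypothetical.

```python
def bonferroni_threshold(n_comparisons, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def to_counting_process(visits, end_time, event):
    """Split one patient's follow-up into (start, stop, score, event) rows.

    visits: (time_in_years, score) pairs, e.g. measured at years 0, 2, and 4.
    end_time: time of the event or of censoring, in years.
    event: True if follow-up ended with the outcome, False if censored.
    Measurements taken at or after end_time are ignored.
    """
    kept = sorted((t, s) for t, s in visits if t < end_time)
    rows = []
    for i, (start, score) in enumerate(kept):
        is_last = i == len(kept) - 1
        stop = end_time if is_last else kept[i + 1][0]
        # Only the final interval can carry the event indicator.
        rows.append((start, stop, score, int(event and is_last)))
    return rows


# Hypothetical patient with FIB-4 measured at years 0, 2, and 4 and an
# event at year 5.5.
rows = to_counting_process([(0, 1.1), (2, 1.6), (4, 2.9)], 5.5, True)
threshold = bonferroni_threshold(54)
```

The resulting threshold is roughly 0.00093, which is consistent with the P < 0.001 cutoff stated above. Rows in this format are what time-varying Cox implementations typically accept, with the score in each row taken as the most recent measurement for that interval.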

Baseline Characteristics.
A total of 2,280 individuals diagnosed with NAFLD between January 2001 and December 2015 were eligible for inclusion in the study (Figure 1). Table 1 summarizes the characteristics of the patients. The median follow-up was 10.9 years (IQR: 8.3-15.5). The mean age was 55.1 years, and 56.9% were overweight or obese (BMI >25). The mean scores for the NFS and FIB-4 were −1.27 and 1.27, respectively. When MICE was performed, similar results were reproduced (Supplementary Table 2).

As shown in Table 2, the cumulative mortality was 0.82 cases per 100 person-years.
In the model that considered designated time-point measurements of the scores, higher aHRs were observed. Patients with higher designated time-point measurements of NFS had [...] (Table 3). No association of NFS or FIB-4 with risk of death was observed for the comparison between the intermediate and low probability of advanced fibrosis groups in either model. We observed similar results when missing values were imputed (Table 4).

Liver-Related Outcomes.
As shown in Table 2, during follow-up, 10 HCC cases and 1 liver transplantation occurred. A total of 126 patients developed cirrhosis, with a cumulative incidence of 0.50 cases per 100 person-years.
In the model that considered baseline values of the scores, patients with higher baseline NFS tended to have a higher risk of liver-related outcomes (high versus low probability of advanced fibrosis groups, aHR = 3.79, 95% CI = 1.32-10.92, and P = 0.01). No relationship between NFS and the risk of liver-related outcomes was found when comparing the intermediate and low probability groups for advanced fibrosis, or when NFS was analyzed as a continuous variable. Similarly, for FIB-4, a higher baseline value correlated with a higher risk of liver-related outcomes (as a continuous variable, aHR = 1.35, 95% CI = 1.20-1.53, and P < 0.001; high versus low probability of advanced fibrosis groups, aHR = 12.08, 95% CI = 6.41-22.77, and P < 0.001; intermediate versus low probability of advanced fibrosis groups, aHR = 1.91, 95% CI = 1.11-3.27, and P = 0.02). The comparison between the intermediate and low probability groups for FIB-4 did not reach statistical significance at the Bonferroni-corrected threshold. Higher baseline APRI values were also associated with an increased risk of liver-related outcomes (as a continuous variable, aHR = 1.35, 95% CI = 1.13-1.61, and P < 0.001; high versus low probability of advanced fibrosis groups, aHR = 10.98, 95% CI = 4.99-24.15, and P < 0.001; intermediate versus low probability of advanced fibrosis groups, aHR = 3.81, 95% CI = 2.34-6.18, and P < 0.001) (Table 3).
In the model that considered designated time-point measurements of the scores, the associations were stronger than those in the models that considered only baseline values. Patients with higher NFS at designated time points had an increased risk of liver-related outcomes (high versus low probability of advanced fibrosis groups, aHR = 6.55, 95% CI = 2.46-17.42, and P < 0.001). No association between NFS and liver-related outcomes was observed for comparisons between the intermediate and low probability of advanced fibrosis groups or when NFS was considered as a continuous variable. For FIB-4, higher designated time-point measurement values were associated with an increased risk of liver-related outcomes (as a continuous variable, aHR = 1.38, 95% CI = 1.26-1.52, and P < 0.001; high versus low probability of advanced fibrosis groups, aHR = 13.65, 95% CI = 6.84-27.27, and P < 0.001; intermediate versus low probability of advanced fibrosis groups, aHR = 1.94, 95% CI = 1.14-3.33, and P = 0.01). For APRI, higher designated time-point measurement values were associated with an increased risk of liver-related outcomes (as a continuous variable, aHR = 1.78, 95% CI = 1.48-2.14, and P < 0.001; high versus low probability of advanced fibrosis groups, aHR = 11.73, 95% CI = 5.02-27.45, and P < 0.001; intermediate versus low probability of advanced fibrosis groups, aHR = 4.31, 95% CI = 2.69-6.90, and P < 0.001) (Table 3). We observed similar results when missing values were imputed (Table 4).

As shown in Table 2, the cumulative incidences of ACS, HF, stroke, and PAD were 0.16, 0.24, 0.73, and 0.26 per 100 person-years, respectively. There does not appear to be an association between NFS/FIB-4 and the risk of cardiovascular events, for the comparison between high versus low or intermediate versus low probability of advanced fibrosis groups, in either model (Table 3).

[...] 1). The outcomes replicated were nearly identical to those observed in NAFLD (Supplementary Table 2).

Discussion
In this retrospective cohort study of 2,280 NAFLD patients with a median follow-up of over 10 years, we assessed the association between NFS/FIB-4/APRI and the risk of death, liver-related outcomes, and cardiovascular events while considering the longitudinal changes in these scores. We found that patients with higher designated time-point measurements of NFS/FIB-4/APRI scores had a higher risk of liver-related outcomes than those with lower scores. Significant associations were observed for the designated time-point measurements of NFS, FIB-4, and APRI after adjusting for multiple comparisons using the Bonferroni correction. There does not appear to be an association between NFS/FIB-4/APRI and the risk of cardiovascular events in either model. Following recent definitions, NFS/FIB-4/APRI scores provided consistent results in patients with MASLD as well as NAFLD, suggesting their potential utility in future studies.
NFS, FIB-4, and APRI are among the commonly used noninvasive tests to rule out advanced fibrosis in at-risk groups [34]. Higher baseline values have been associated with a higher risk of all-cause mortality and liver-related outcomes [34,35]. With repeated measurements available in our cohort, we were able to use an advanced statistical approach to evaluate the prognostic value of the designated time-point measurements of the scores for long-term clinical outcomes. Compared with baseline measurements, scores measured more recently may be more relevant to the onset of events and may better predict the patient's risk of future events. We observed a significant increase in the risk of death and liver-related outcomes when comparing patients with a high probability of advanced fibrosis to those with a low probability. The difference in risk of outcomes when comparing the intermediate to the low probability groups was less pronounced, suggesting that those with a higher probability of advanced fibrosis were more likely to experience outcomes. Unlike liver biopsy, the parameters used to calculate these scores are routinely collected and easily accessible in clinical settings, which makes regular monitoring of these scores and risk stratification of patients with regard to long-term prognosis possible and can inform targeted interventions.
NAFLD is a multisystem disease, and its clinical burden is not limited to the liver. NAFLD is closely associated with subclinical markers of cardiovascular disease, and advanced fibrosis in NAFLD patients has been reported to be an independent cardiovascular risk factor [36,37]. As easy-to-calculate and widely available scores that quantify the probability of advanced fibrosis, NFS, FIB-4, and APRI have potential as prognostic scores for cardiovascular risk [38,39]. In our study, a positive association between the designated time-point measurements of NFS, FIB-4, and APRI and cardiovascular risk was observed, but the results were not statistically significant. Previous studies have reported inconsistent findings regarding the association between these scoring systems and cardiovascular outcomes [34,38,39]. The discrepancy in the findings may be attributed to differences in sample size, definitions of cardiovascular events, inclusion/exclusion criteria of the study population, statistical modelling, and management of effect modifiers in the analyses.
We estimated the incidence of primary outcomes in our NAFLD cohort. In a previous study conducted in the US, the incidence of HCC in a Western population with no cirrhosis was 0.03 cases per 100 person-years [40]. This is in line with our study findings. Another study in Sweden [41] found that NAFLD patients had a higher risk of overall mortality and liver-related outcomes than non-NAFLD patients, but the incidence of overall mortality and liver-related outcomes was higher than in our study. This difference may be explained by underlying differences in study populations or in methods of outcome ascertainment. Of note, direct comparison might be challenging due to the variation in the severity of NAFLD and other baseline variables.
This study is the first to evaluate the predictive value of designated time-point measurements of NFS, FIB-4, and APRI on the long-term prognosis of NAFLD patients in South Korea. However, limitations of this study should be noted. First, although noninvasive fibrosis indices such as NFS, FIB-4, and APRI have been validated in a heterogeneous group of NAFLD patients and have relatively good accuracy in detecting advanced fibrosis (F3 or F4), their components include routine blood tests that can be affected by factors other than the degree of liver fibrosis, which decreases their accuracy.
Second, although higher risk estimates were observed in models that considered designated time-point measurements, the magnitude of the difference compared to the models with baseline values was small, likely because the follow-up measurements of the scores were taken close to the baseline. We excluded patients with incomplete follow-up, potentially introducing selection bias and limiting the external validity of the study. Future longitudinal studies are warranted to validate our study findings. In addition, for patients with follow-up but incomplete laboratory data, we used multiple imputation to address missingness in our analysis, assuming the data to be missing at random. However, it is possible that the missingness is not at random, which could introduce bias into the results. Lastly, this investigation was conducted retrospectively at a single tertiary center, and the results of this study may not be generalizable to patients with different characteristics.
In conclusion, this retrospective cohort study demonstrated that baseline and designated time-point measurement values of FIB-4, NFS, and APRI were significantly associated with liver-related outcomes in NAFLD or MASLD patients. The results indicate that routine monitoring of these scores in addition to the baseline measurement may better predict the prognosis of NAFLD or MASLD patients. Further studies are needed to validate the findings in diverse populations.

[...] the MASLD patients. Supplementary Table 4: Multivariable Cox regression of scoring systems on risk of primary outcomes in MASLD patients. (Supplementary Materials)

Table 2: Incidence of primary outcomes during the follow-up. One patient may have multiple events. ACS, acute coronary syndrome; HCC, hepatocellular carcinoma; HF, heart failure; IQR, interquartile range; PAD, peripheral artery disease.

Table 3: Multivariable Cox regression of scoring systems on risk of primary outcomes.

Table 4: Multivariable Cox regression of scoring systems on risk of primary outcomes (imputed data).